7 research outputs found

    Algebraic Methods in the Congested Clique

    Full text link
    In this work, we use algebraic methods for studying distance computation and subgraph detection tasks in the congested clique model. Specifically, we adapt parallel matrix multiplication implementations to the congested clique, obtaining an O(n1−2/ω)O(n^{1-2/\omega}) round matrix multiplication algorithm, where ω<2.3728639\omega < 2.3728639 is the exponent of matrix multiplication. In conjunction with known techniques from centralised algorithmics, this gives significant improvements over previous best upper bounds in the congested clique model. The highlight results include: -- triangle and 4-cycle counting in O(n0.158)O(n^{0.158}) rounds, improving upon the O(n1/3)O(n^{1/3}) triangle detection algorithm of Dolev et al. [DISC 2012], -- a (1+o(1))(1 + o(1))-approximation of all-pairs shortest paths in O(n0.158)O(n^{0.158}) rounds, improving upon the O~(n1/2)\tilde{O} (n^{1/2})-round (2+o(1))(2 + o(1))-approximation algorithm of Nanongkai [STOC 2014], and -- computing the girth in O(n0.158)O(n^{0.158}) rounds, which is the first non-trivial solution in this model. In addition, we present a novel constant-round combinatorial algorithm for detecting 4-cycles.Comment: This is work is a merger of arxiv:1412.2109 and arxiv:1412.266

    The bulk-synchronous parallel random access machine

    Get PDF
    The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. Originally, BSP was defined as a distributed memory model. Shared-memory style BSP programming had to be provided by PRAM simulation. However, this approach destroys data locality and therefore may prove inefficient for many practical problems. In this paper we present a new BSP-type model, called BSPRAM, which reconciles sharedmemory style programming with efficient exploitation of data locality. BSPRAM can be optimally simulated by BSP for a broad range of algorithms. We identify some characteristic properties of such algorithms: obliviousness, slackness, granularity. Finally, we illustrate these concepts by presenting BSPRAM algorithms for butterfly dag computation, cube dag computation, dense matrix multiplication and sorting

    The Design and Analysis of Bulk-Synchronous Parallel Algorithms

    No full text
    The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. This thesis presents a systematic approach to the design and analysis of BSP algorithms. We introduce an extension of the BSP model, called BSPRAM, which reconciles shared-memory style programming with efficient exploitation of data locality. The BSPRAM model can be optimally simulated by a BSP computer for a broad range of algorithms possessing certain characteristic properties: obliviousness, slackness, granularity. We use BSPRAM to design BSP algorithms for problems from three large, partially overlapping domains: combinatorial computation, dense matrix computation, graph computation. Some of the presented algorithms are adapted from known BSP algorithms (butterfly dag computation, cube dag computation, matrix multiplication). Other algorithms are obtained by application of established non-BSP techniques (sorting, randomised list contraction, Gaussian elimination without pivoting and with column pivoting, algebraic path computation), or use original techniques specific to the BSP model (deterministic list contraction, Gaussian elimination with nested block pivoting, communication-efficient multiplication of Boolean matrices, synchronisation-efficient shortest paths computation). The asymptotic BSP cost of each algorithm is established, along with its BSPRAM characteristics. We conclude by outlining some directions for future research

    Parallel Priority Queue and List Contraction: The BSP Approach

    No full text
    . In this paper we present efficient and practical extensions of the randomized Parallel Priority Queue (PPQ) algorithms of Ranade et al., and efficient randomized and deterministic algorithms for the problem of list contraction on the Bulk-Synchronous Parallel (BSP) model. We also present an experimental study of their performance. We show that our algorithms are communication efficient and achieve small multiplicative constant factors for a wide range of parallel machines. 1 Introduction We present an architecture independent study of the computation and communication requirements of an efficient Parallel Priority Queue (PPQ) implementation and list contraction algorithms along with an experimental study. The computational model adopted is the Bulk-Synchronous Parallel (BSP) model, proposed by L. G. Valiant [20], which deals explicitly with the notion of communication and synchronization among computational threads. A detailed discussion of the BSP model appears in [20]. The first a..

    Parallel Priority Queue and List Contraction: The BSP Approach

    No full text
    In this work we present efficient and practical randomized data structures on the Bulk-Synchronous Parallel (BSP) model of computation along with an experimental study of their performance. In particular, we study data structures for the realization of Parallel Priority Queues (PPQs). We show that our algorithms are communication efficient and achieve optimality to within small multiplicative constant factors for a wide range of parallel machines. We also present an experimental study of our PPQ algorithms on a Cray T3D\@. Finally, we present new randomized and deterministic BSP algorithms for list and tree contraction
    corecore